Add cut-point rules for specific functions / intrinsics (via definition)) by dkcumming · Pull Request #960 · runtimeverification/mir-semantics

dkcumming · 2026-02-28T18:15:56Z

This PR builds upon #931 modifying the approach in response to the comments on that PR. For full context read #931 first.

The kmir prove-rs flag --break-on-function is implemented in this PR as a compiled definition with hooked function to retrieve the function names to match on. This is similar to the already existing pattern that compiles the static data of a KMIR configuration into the definition. This allows for functions to be provided both when creating the initial proof, and when reading from disc (triggers a recompile of llvm if different flags are provided).

I added a test to demonstrate this working on functions and intrinsics, only matching those provided. I do not have a test from reading a partial proof and adding different function names - I did test it but it seemed a bit overboard for a test just now.

I did try the method with K shell access impure function, however this created branching for every function call since the result was stored in a symbolic value. I couldn't figure out how to get that working concretely (I don't think it is possible but might be wrong).

The `breakOnFunctions` cell will contain identifiers of functions that should be broken on by the `termCallFunctionFilter` rule. Which is identical to `termCallFunction` except it checks if the id is in the `breakOnFunctions` cell and has a different identifier for the cut point rules.

Fully qualified paths may have commas. Instead of using a separator use multiple occurances of the flag

…cut-points

2 intrinsics and 2 functions are added, 1 of each is broken on only and should appear in the output

Since the config does not change now there is no value in changing the test files as it now only would change the non-deterministic tys

Stevengre

Thanks for providing this feature. But it looks like the changes in kast.py, test_decode_value.py and iterator-enumerate-fail.rs are unrelated files in this PR. Additionally, the filter may cost more time for execution. Have you tested the influence of it? I don't know if it's worth to introduce such a complexity and fallback for this feature. Maye be we should leave a debug branch just for these features?

dkcumming · 2026-03-02T04:01:38Z

Yes kast.py and test_decode_value.py I didn't revert back properly, I will fix that now. And iterator-enumerate-fail.rs was a different test I didn't meant to commit. I will fix that now

I will run with and without this for a handful of tests and see the timing difference. We could leave it on a debug branch, however from what @palinatolmach suggested this is already functionality that exists in Kontrol (although implemented differently), I feel it would be better to provide this to ourselves and any other potential users without the friction of accessing it through a debug branch. Of course if there is a significant timing regression we have to weigh that tradeoff, so I will get that information and post it here.

dkcumming · 2026-03-02T10:58:52Z

Integration Tests MIR Semantics

I didn't filter the tests properly so I ran 1 extra one on this branch, but even with the extra test the times are pretty much the same.

daniel@daniel-MS-7E06 mir-semantics$ git rev-parse @
e23e477a3b53d1a6c4615a966ecf6f38ae883fb3
daniel@daniel-MS-7E06 mir-semantics$ time make test-integration
//-snip-
============================= 321 passed, 7 skipped in 3007.41s (0:50:07) ==============================

real    50m10.432s
user    284m3.711s
sys     8m32.724s

daniel@daniel-MS-7E06 mir-semantics-filter$ git rev-parse @
bffe2d185653f23e5dea1733d32908900c405fb6
daniel@daniel-MS-7E06 mir-semantics-filter$ time make test-integration TEST_ARGS='-k "not test_cli_break_on_function"'
============================= 322 passed, 7 skipped in 3043.85s (0:50:43) ==============================

real    50m44.602s
user    286m15.304s
sys     8m33.420s

P-Token Tests

	master	this PR
Approve single-thread	277	290
Revoke single-thread	214	225
MintTo single-thread	1290	1260
Burn --max-workers 4	2694	2695

jberthold

LGTM overall.
I agree this should be tested and not merged if it causes a major performance regression. Would be surprised if it was causing issues, though, as the function evaluation should be fast, probably faster than getting an environment variable.

…sion

ehildenb · 2026-03-02T14:54:54Z

kmir/src/kmir/kdist/mir-semantics/kmir.md

+  syntax String ::= getFunctionName(MonoItemKind) [function, total]
+  //---------------------------------------------------------------
+  rule getFunctionName(monoItemFn(symbol(NAME), _, _)) => NAME
+  rule getFunctionName(monoItemStatic(symbol(NAME), _, _)) => NAME
+  rule getFunctionName(monoItemGlobalAsm(_)) => ""
+  rule getFunctionName(IntrinsicFunction(symbol(NAME))) => NAME
+
+  // Check whether a function name matches any filter in the break-on-functions list.
+  syntax Bool ::= #functionNameMatchesEnv(String) [function, total]
+  //----------------------------------------------------------------
+  rule #functionNameMatchesEnv(NAME) => #functionNameMatchesEnvStr(NAME, #breakOnFunctionsString(0))
+
+  // The Int argument is unused; it exists only so the Haskell backend can
+  // pattern-match on it and not error since zero-argument functions cannot use [owise].
+  syntax String ::= #breakOnFunctionsString(Int) [function, total, symbol(breakOnFunctionsString)]
+  //-----------------------------------------------------------------------------------------------
+  rule #breakOnFunctionsString(_) => "" [owise] // This gets overridden by corresponding python function
+
+  syntax Bool ::= #functionNameMatchesEnvStr(String, String) [function, total]
+  //--------------------------------------------------------------------------
+  rule #functionNameMatchesEnvStr(_, "") => false
+  rule #functionNameMatchesEnvStr(NAME, ENV) => #functionNameMatchesAnyList(NAME, #splitSemicolon(ENV))
+    requires ENV =/=String ""
+
+  syntax List ::= #splitSemicolon(String) [function, total]
+  //--------------------------------------------------------
+  rule #splitSemicolon(S) => #splitSemicolonAux(S, findString(S, ";", 0))
+
+  syntax List ::= #splitSemicolonAux(String, Int) [function, total]
+  //-----------------------------------------------------------------
+  rule #splitSemicolonAux(S, -1) => ListItem(S)
+  rule #splitSemicolonAux(S, I) =>
+      ListItem(substrString(S, 0, I)) #splitSemicolon(substrString(S, I +Int 1, lengthString(S)))
+    requires I >=Int 0
+
+  syntax Bool ::= #functionNameMatchesAnyList(String, List) [function, total]
+  //-------------------------------------------------------------------------
+  rule #functionNameMatchesAnyList(_, .List) => false
+  rule #functionNameMatchesAnyList(NAME, ListItem(FILTER:String) REST) =>
+      0 <=Int findString(NAME, FILTER, 0) orBool #functionNameMatchesAnyList(NAME, REST)
+  rule #functionNameMatchesAnyList(_, _) => false [owise]
+


These all seem like pure functions that could be moved to a helper module to keep this file cleaner/smaller. Is there a utils.md file we could move it to in order to keep this file cleaner?

We do not have a utils.md but we definitely need one as we have many things scattered. I think that will be something done in another PR as it will be more comprehensive, I will co-ordinate with @Stevengre @mariaKt on this!

I think we do need some refactor to make the semantics easier to access with better structure. Should we write a proposal about how to manage and how to migrate?

ehildenb · 2026-03-02T14:55:55Z

Does it automatically detect if the function break arguments have not changed, and if so not rekompile? Or does the user have to manually rekompile with the desired flags?

dkcumming · 2026-03-02T15:05:55Z

If you use the normal uv/pyk interface prove-rs it will only recompile when needed and automatically

dkcumming added 17 commits February 3, 2026 15:37

Added helper to get MonoItemKind name

f60a7b2

Plumbing break_on_function from cli to initial proof configuration

b860983

Changed explicit string match to identifier-contains-substring

2b0582b

Added support for intrinsics too

4b2e037

Support comma separated list of identifiers to match on

4a3c81e

Changed list from comma separated to append.

a709603

Fully qualified paths may have commas. Instead of using a separator use multiple occurances of the flag

Fix KAST in decode test

2030315

Updated test output

81b4e8d

Removed accidentally commited definition

8515c55

Merge remote-tracking branch 'origin/master' into dc/filter-function-…

e1bc16d

…cut-points

Added cli test for --break-on-function

39cd8de

2 intrinsics and 2 functions are added, 1 of each is broken on only and should appear in the output

Removing breakOnFunctions cell from configuration

680f753

Added break-on-functions as static data in defintion

e7a8665

Added dummy argument so haskell backend can match and not error

31790b7

Updated test files (removes breakOnFunction cell and shuffles tys)

fac6cd0

Reverted test files

caf2829

Since the config does not change now there is no value in changing the test files as it now only would change the non-deterministic tys

dkcumming requested review from Stevengre, ehildenb, jberthold, mariaKt, palinatolmach and tothtamas28 February 28, 2026 18:15

dkcumming self-assigned this Feb 28, 2026

Stevengre reviewed Mar 2, 2026

View reviewed changes

Minor fixes to accidental diffs

bffe2d1

jberthold approved these changes Mar 2, 2026

View reviewed changes

Merge branch 'master' into dc/filter-function-cut-point-definiton-ver…

2016e37

…sion

Merge branch 'master' into dc/filter-function-cut-point-definiton-ver…

cd24b27

…sion

ehildenb reviewed Mar 2, 2026

View reviewed changes

dkcumming merged commit 2c84b7f into master Mar 3, 2026
7 checks passed

dkcumming deleted the dc/filter-function-cut-point-definiton-version branch March 3, 2026 02:23

This was referenced Mar 3, 2026

Update feature/p-token with latest master #965

Merged

Updated mir-semantics runtimeverification/solana-token#167

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add cut-point rules for specific functions / intrinsics (via definition))#960

Add cut-point rules for specific functions / intrinsics (via definition))#960
dkcumming merged 20 commits intomasterfrom
dc/filter-function-cut-point-definiton-version

dkcumming commented Feb 28, 2026 •

edited

Loading

Uh oh!

Stevengre left a comment

Uh oh!

dkcumming commented Mar 2, 2026

Uh oh!

dkcumming commented Mar 2, 2026 •

edited

Loading

Uh oh!

jberthold left a comment

Uh oh!

ehildenb Mar 2, 2026

Uh oh!

dkcumming Mar 2, 2026

Uh oh!

Stevengre Mar 3, 2026

Uh oh!

ehildenb commented Mar 2, 2026

Uh oh!

dkcumming commented Mar 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

dkcumming commented Feb 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Stevengre left a comment

Choose a reason for hiding this comment

Uh oh!

dkcumming commented Mar 2, 2026

Uh oh!

dkcumming commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Integration Tests MIR Semantics

P-Token Tests

Uh oh!

jberthold left a comment

Choose a reason for hiding this comment

Uh oh!

ehildenb Mar 2, 2026

Choose a reason for hiding this comment

Uh oh!

dkcumming Mar 2, 2026

Choose a reason for hiding this comment

Uh oh!

Stevengre Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

ehildenb commented Mar 2, 2026

Uh oh!

dkcumming commented Mar 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

dkcumming commented Feb 28, 2026 •

edited

Loading

dkcumming commented Mar 2, 2026 •

edited

Loading